Replication of below article’s Data and Visualizations
“We keep pumping money into the NHS. Is it good value?
By Tom Calver”
Karim K. Kardous

Health Service and General Prelude

About the data/article in a nutshell: Tom uses a combination of sources to make his point → while not a bottom ‘performer’; especially compared to the US, UK’s National Health Service (NHS) did not wait for the COVID crisis to show its limits and harbor increasing consumer dissatisfaction.
That downward trend can be traced back to the early 2010’s; and that comes from the service running lean - too lean, among other reasons.
For more detail on the sources, feel free to visit the actual article with link embedded at the top of the page in to author’s name.
If you like to download the data, feel free to use urls here.

Overall Strategy for building first plot: The data for first plot is smoothed out/interpolated data; this is usually done to ‘well smooth’ the data and delineate clearer trends over time; in this case select European countries’ survey takers’ scores on their overall satisfaction level towards their respective countries’ public health services (ESS - European Social Survey).
In this case, there is an additional reason the author smoothed the data; it is to ‘complete’ the years given that the ESS is done once every two years; note that I am assuming here but it’s not an outrageous assumption to make. So in order to match that smoothing, I go with geom_smooth() from ggplot2 and keep span at default; after a few iterations; the data points (for non survey years) match highly to what Tom displays in the first graph.
Finally, since the graph is interactive, I use ggiraph package to emulate said interactivity; a JS based R package that lets you add tooltips/hover/highlight upon hover/downplay non-hovered, etc. all the usual things one expects from an interactive plot; without having to build a Shiny app; which for this exercise/first plot (and the rest); would be like building a Gatling gun to aim at an ant.

Steps taken:
While I try to be as detailed in my comments as possible; it’s still helpful to lay out the step by step process as a numbered list to get the overall chain of the code is supposed to do on a high level - without having to go into the nitty gritty- the comments in the code chunks and in that regard should hopefully help:

  1. Found the source of the data from HTML Source Page; clicked on Network tab after hovering on the plot panel; refreshed the page; and found ‘dataset’ under ‘datawrapper’.

  2. The data was wide in structure (from raw csv):
    17 columns (1 column for year, 8 hex-coded columns with imputed/smoothed values, and 8 columns for country abbreviations with survey data for even years, NULL otherwise).

  3. Discovered the hex columns and country columns didn’t align in a straightforward way. Columns were randomly ordered within each set, requiring a ranking approach rather than pairwise matching (one hex column to the symmetrical position of the country labeled column).

  4. Implemented a solution by sorting satisfaction scores per year, which helped group values by country through proximity of their scores. This approach works well since the values are interpolated through smoothing, making them very close to one another from row to row. This might not work in other cases, but it does here.

  5. Combined two sorted datasets: year + hex columns, and year + country columns to create a properly aligned mapping, joined facts data (with scores/values) on newly created mapped long datasets (converted from wide- almost always much harder to work with) to then finalize the dataset for visualization. More detail on ‘finalized the dataset’ can be found in the comments of the actual code.
    Prepared the final clean dataset for interactive plotting with ggiraph.

Health Service Satisfaction

Show the code
#|echo: false
#|message: false
#|warning: false
#|include: false

invisible({
  # install pacman if it's not already installed
  if (!requireNamespace("pacman", quietly = TRUE)) install.packages("pacman")
  # install.packages("gdtools", type = "source")
  
  ## NOTE; potential (might not be needed) steps on mac for registering then loading Roboto font into Quarto below
  
  ## from terminal/shell
  # brew install cairo fontconfig freetype pkg-config
  # export PKG_CONFIG_PATH="/opt/homebrew/lib/pkgconfig:/opt/homebrew/share/pkgconfig"
  # export PKG_CFLAGS="-I/opt/homebrew/include"
  # export PKG_LIBS="-L/opt/homebrew/lib"
  
  ## then from Rstudio
  # install.packages("gdtools", type = "source")
  
  
  pacman::p_load(
    xml2,
    downlit,
    gdtools,
    tidyverse,
    quarto,
    chromote,
    here,
    tidycensus,
    janitor,
    purrr,
    ggtext,
    ggshadow,
    ggiraph,
    gfonts,
    showtext,
    ggborderline,
    shiny,
    gt,
    rsvg,
    magick,
    stringr,
    ggimage
  )
  showtext::showtext_auto(enable = FALSE) 
}
)


# load fonts properly - using simpler approach
library(systemfonts)
library(showtext)

# roboto font properly
font_add_google("Roboto", "roboto")
showtext_auto()
# ggplot theme set
theme_set(
  theme_minimal() +
  theme(text = element_text(family = "roboto", "Roboto"))
)

While above took care of importing required libraries and setting general options such as plot theme and text font to be used; below is the start of data related tasks; from initial pull, to wrangling, to finally output the visualizations.

Data Pull

Show the code
#|echo: false
#|message: false
#|warning: false
#|include: false
#|eval: false

invisible({
  b <- ChromoteSession$new()
  b$Page$navigate("https://www.thetimes.com/comment/columnists/article/we-keep-pumping-money-into-the-nhs-is-it-good-value-blq8bxc39")
  Sys.sleep(3) # allow some time for dynamic content to render
})

# extract all iframe srcs (joined by || in this case)
iframes_html <- b$Runtime$evaluate("Array.from(document.querySelectorAll('iframe')).map(el => el.src).join('||')")$result$value
# split and filter valid Datawrapper url's
chart_urls <- str_split(iframes_html, "\\|\\|")[[1]] |>
  str_subset("^https://datawrapper\\.dwcdn\\.net/[a-zA-Z0-9]+/\\d+$")

all_data <- purrr::map_dfr(chart_urls, function(url) {  
  message("Navigating to: ", url)
  b$Page$navigate(url)
  Sys.sleep(3)
  
  html <- b$Runtime$evaluate("document.documentElement.outerHTML")$result$value
  
  # match visible chart values if any
  pattern <- 'aria-datavariables="year,\\s*([A-Z]+)".*?aria-datavalues="([0-9]{4}),\\s*([0-9.]+)"'
  matches <- str_match_all(html, pattern)[[1]]
  
  # match dataset.csv url as well 
  csv_pattern <- "https://datawrapper\\.dwcdn\\.net/[a-zA-Z0-9]+/\\d+/dataset\\.csv"
  csv_link <- str_extract(html, csv_pattern)
  if (is.na(csv_link)) {
    csv_link <- str_glue("{url}/dataset.csv")
  }
  
  tibble(
    chart_url = url,
    country = if(nrow(matches)) matches[, 2] else NA,
    year = if(nrow(matches)) as.integer(matches[, 3]) else NA,
    value = if(nrow(matches)) as.numeric(matches[, 4]) else NA,
    dataset_csv = csv_link
  )
})

# add a custom gt boilerplate -from {gt} package (great tables)-to reduce code redundancy (having to copy/paste same chunks of code every # time we turn a tibble into a gt object)
gt_nyt_custom <- function(x, title = '', subtitle = '', first_10_rows_only = TRUE){
  
  x <- x |> clean_names(case = 'title')
  numeric_cols <- x |> select(where(is.double)) |> names()
  integer_cols <- x |> select(where(is.integer)) |> names()
  
  title_fmt <- if(title != "") glue::glue("**{title}**") else ""
  subtitle_fmt <- if(subtitle != "") glue::glue("*{subtitle}*") else ""
  
  x |>
    (\(x) if (first_10_rows_only) slice_head(x, n = 10) else x)() |>
    gt() |> 
    tab_header(
      title = md(title_fmt),
      subtitle = md(subtitle_fmt)
    ) |> 
    tab_style(
      style = list(
        cell_text(color = '#333333')
      ),
      locations = cells_body()
    ) |> 
    tab_style(
      style = list(
        cell_text(color = '#CC6600', weight = 'bold')
      ),
      locations = cells_column_labels(everything())
    ) |> 
    fmt_number(
      columns = c(numeric_cols),
      decimals = 1
    ) |> 
    fmt_number(
      columns = c(integer_cols),
      decimals = 0
    ) |> 
    tab_options(
      table.font.names = c("Merriweather", "Georgia", "serif"),
      table.font.size = 14,
      heading.title.font.size = 18,
      heading.subtitle.font.size = 14,
      column_labels.font.weight = "bold",
      column_labels.background.color = "#eeeeee",
      table.border.top.color = "#dddddd",
      table.border.bottom.color = "#dddddd",
      data_row.padding = px(6),
      row.striping.include_table_body = TRUE,
      row.striping.background_color = "#f9f9f9"
    )
  
}

Metadata

Show the code
#|echo: false
#|message: false
#|warning: false

# reveal dataset urls/csvs
all_data |>
  count(
    url = chart_url, download_link = dataset_csv
  ) |>
  select(-n) |>
  gt_nyt_custom(
    title = 'Dataset Ids'
  ) |>
  cols_label(
    Url = "Plot URL",
    `Download Link` = "Link to CSV"
  ) |> 
  tab_footnote(
    "In the event you download the links yourself and run your own script,
    the third and last should be treated as tsv files, otherwise csv's"
  ) 
Dataset Ids
Plot URL Link to CSV
https://datawrapper.dwcdn.net/7NJRB/1 https://datawrapper.dwcdn.net/7NJRB/1/dataset.csv
https://datawrapper.dwcdn.net/Bxhol/4 https://datawrapper.dwcdn.net/Bxhol/4/dataset.csv
https://datawrapper.dwcdn.net/JH3Qn/1 https://datawrapper.dwcdn.net/JH3Qn/1/dataset.csv
https://datawrapper.dwcdn.net/Mc3q2/2 https://datawrapper.dwcdn.net/Mc3q2/2/dataset.csv
https://datawrapper.dwcdn.net/eXQPs/1 https://datawrapper.dwcdn.net/eXQPs/1/dataset.csv
In the event you download the links yourself and run your own script, the third and last should be treated as tsv files, otherwise csv's

Data Sample

Show the code
# reveal data sample for year 2004 as an example
all_data |> 
  filter(year == 2004) |>
  select(2:last_col()) |> 
  gt_nyt_custom() |> 
  tab_header(
    title = md("**Chart Data Summary**"),
    subtitle = md("*Extracted from embedded datawrapper in the HTML*")
  )
Chart Data Summary
Extracted from embedded datawrapper in the HTML
Country Year Value Dataset Csv
DE 2,004 4.7 https://datawrapper.dwcdn.net/7NJRB/1/dataset.csv
PT 2,004 3.5 https://datawrapper.dwcdn.net/7NJRB/1/dataset.csv
IE 2,004 4.1 https://datawrapper.dwcdn.net/7NJRB/1/dataset.csv
NO 2,004 5.7 https://datawrapper.dwcdn.net/7NJRB/1/dataset.csv
FR 2,004 5.8 https://datawrapper.dwcdn.net/7NJRB/1/dataset.csv
ES 2,004 5.8 https://datawrapper.dwcdn.net/7NJRB/1/dataset.csv
GB 2,004 5.4 https://datawrapper.dwcdn.net/7NJRB/1/dataset.csv

Data Wrangling

Show the code
#|echo: false
#|message: false
#|warning: false

# for easier referencing, assign file types (csv where appropriate, tsv otherwise) 
file_info <- tibble(
  path = unique(all_data$dataset_csv),
  name = c("health_service_sat:csv", "value_for_money:csv", 
           "room_to_improve:tsv", "barely_beds:csv", "budget_breakdowns:tsv")
  ) |> 
  separate(
    name, into = c('dataset_name', 'file_type'), sep = "\\:"
  )

# loop thru datasets, read them in, and then assign them to the global environment
invisible({
  file_info |>
    mutate(
      data = pmap(
        list(path, file_type),
        \(path, file_type) if(file_type == "csv") read_csv(path) else read_tsv(path)
      )
    ) |>
    select(dataset_name, data) |>
    deframe() |>
    list2env(envir = .GlobalEnv)
}
)

# set country 'switch'; so that tooltip can reflect full country name (spelled out) accordingly for imputed values (non survey years smoothed values)
country_labels <- c(
  NO = "Norway", DE = "Germany", ES = "Spain", 
  FR = "France", GB = "UK", IE = "Ireland", PT = "Portugal"
  )
country_label_tibble <- c(
    NO = "Norway", DE = "Germany", ES = "Spain",
    FR = "France", GB = "UK", IE = "Ireland", PT = "Portugal"
  ) |> 
  enframe()

# clean the first dataset: health_service_sat, to prep for plotting.
# it has 17 columns: 1 year column, 8 hex-coded columns (imputed/smoothed values), 
# and 8 columns for country abbreviations (survey data, even years).
# initially assumed hex columns align pairwise with country columns; but it was not the case.s
# columns are randomly ordered within each set, so we use ranking instead.
# sorting by satisfaction score per year helps group values by country (via proximity of their scores).
# we then combine two sorted datasets: year + hex columns, and year + country columns
# note that this might not always be the go-to solution but in this case, 
# and given that the values are interpolated (through smoothing), we can safely bet that the values will be very close to one another
# from one row to the next

hex_to_country_mapping <- health_service_sat |> 
  pivot_longer(
    -year
  ) |> 
  slice_max(year) |> 
  filter(
    str_starts(name, '\\#') & !str_detect(name, 'A9FF') # looking at last values from article curves, we can infer this is Italy so 
  ) |> 
  arrange(value) |> 
  bind_cols(
    health_service_sat |> 
      pivot_longer(
        -year
      ) |> 
      slice_max(year) |> 
      filter(
        !str_starts(name, '\\#') & !str_detect(name, 'IT')
      ) |> 
      arrange(value)
  ) |> 
  select(
    years = 1, hex_code = 2, second_to_last_val = 3, 
    years_max = 4, country_abb = 5, last_val = 6
  ) |> 
  mutate(
    val_diff = abs(last_val - second_to_last_val)
  ) |> 
  arrange(val_diff) |> 
  select(
    hex_code, country_abb
  ) |> 
  # also join to country_label_tibble to get full country names for future use
  inner_join(
    country_label_tibble, 
    join_by(country_abb == name)
    )
# now we can map the randomly assigned hex value labels to the actual columns/countries, and create 8 series,
# one for each country
health_service_sat <- health_service_sat |> 
  pivot_longer(
    -year
  ) |> 
  left_join(
    hex_to_country_mapping, 
    join_by(name == hex_code)
  ) |>
  mutate(
    country_abb = coalesce(country_abb, name)
  ) |> 
  inner_join(
    hex_to_country_mapping, 
    join_by(country_abb == country_abb)
  ) |> 
  select(
    year, 
    country_abb,
    country = value,
    value = value.x
  ) |> 
  drop_na() 

extract_smooth_build <- function(tibble, country = 'GB'){
  
  initial_pull <- 
    all_data |> 
    filter(country %in% {{country}}) |> 
    ggplot(aes(x = year, y = value)) + 
    geom_smooth(method = 'loess')
  
  # fetch country abbs for ids, and ranges
  country_ids <- c(na.omit(all_data |> pull(country) |> unique()))
  country_max <- all_data |> filter(country == {{country}}) |> pull(value) |> max()
  country_min <- all_data |> filter(country == {{country}}) |> pull(value) |> min()
  
  # access smoothed, include actual years to imputed/smoothed points, cap at min max per country/series
  # and keep only columns of interest
  smoothed_df <- ggplot_build(initial_pull)[[1]] |> as.data.frame() |> as_tibble()
  
  complete_series <- 
    smoothed_df |> 
    select(year = x, value = y) |> 
    mutate(country := country) |> 
    bind_rows(
      all_data |> 
        filter(country == {{country}}) |> 
        select(year, value) 
    ) |> 
    mutate(
      year = as.integer(year),
      year_val_tie_breaker = if_else(is.na(country), 1, 0)
    ) |> 
    group_by(country, year) |> 
    arrange(desc(year_val_tie_breaker)) |> 
    mutate(ties = row_number()) |> 
    filter(
      if (n() < 4) TRUE else ties + year_val_tie_breaker != 1 # make sure every year/country combo gets same no. of obs
      # and that original values (only in the event a given year is even or survey year) take precedence over smoothed ones
      # otherwise just pass/do nothing
    ) |> 
    ungroup() |> 
    # ensuer smoothed values don't go below/beyond lower/upper bounds
    mutate(
      value = pmin(pmax(value, country_min), country_max)
    ) |> 
    arrange(year) |> 
    fill(country, .direction = 'downup') |>  # since every year starts with 
    select(year, country, value) 
  
  return(complete_series)
  
}
# country vector to loop thru
country_name_abbs <- c(na.omit(all_data |> pull(country) |> unique()))
# combine all series
all_series <- map_dfr(.x = country_name_abbs, ~extract_smooth_build(tibble = all_data, country = .x))

# set contry 'switch; so that tooltip can change accordingly for odd numebred years
country_labels <- c(
  NO = "Norway", DE = "Germany", ES = "Spain",
  FR = "France", GB = "UK", IE = "Ireland", PT = "Portugal"
)

# adding year as continuous variable (decimal years) so that points don't overlap but strech over whithin a year to year span
all_series <- 
  all_series |> 
  mutate(
    rn = row_number(), .by = c(country, year)
  ) |> 
  mutate(
    decimal_year = if_else(rn == 1, year, year + rn / 8)
  ) |> 
  mutate(
    year = decimal_year
  ) |> 
  select(-decimal_year)

# also join on country full name mapping so we can generate a consolidate data_id that links the aestethics together (for interactive simultaneus highlighting, etc.)
all_series <- 
  all_series |> 
  inner_join(
    country_labels |> enframe() |> rename(values = value), 
    join_by(country == name)
  ) |> 
  mutate(
    data_id = str_c(country, values)
  ) |> 
  select(-values) |> 
  mutate(
    country_name = str_sub(data_id, 3, 20)
  )

# also generate visible (and non visible years by exclusion) as they don't visually get the same properties; visible (even numbered years) get the country abb as a tooltip (and larger markers/circles), 
# while 'invisible' ones (odd numbered years along with year 2023) get their country names fully spelled out and get transparent marker/circle fill
visible_years <- c(seq(2002, 2022, 2), 2023)

visible_points <- 
  all_series |> 
  filter(round(year) %in% visible_years & floor(year) == ceiling(year))
invisible_points <- all_series |> 
  filter(!round(year) %in% visible_years & floor(year) != ceiling(year))

# final touchups
# set color mappings
color_map <- expr(
  case_when(
    country %in% c('NO', 'Norway') ~ '#d43b45',
    country %in% c('DE', 'Germany') ~ '#DCA825',
    country %in% c('ES', 'Spain') ~ '#b01622',
    country %in% c('FR', 'France') ~ '#487caa',
    country %in% c('GB', 'UK') ~ '#264250',
    country %in% c('IE', 'Ireland') ~ '#61A861',
    country %in% c('PT', 'Portugal') ~ '#d27e4e',
    TRUE ~ '#000000'
  )
)

# set tooltip mappings
tooltip_map <- expr(
  case_when(
    !year %in% c(seq(2002, 2022, 2), 2023) & country %in% names(country_labels) ~ country_labels[country],
    TRUE ~ country
  )
)

label_data <-
  all_series |>
  group_by(country) |>
  arrange(desc(year)) |> 
  filter(row_number() == 1) |>
  mutate(
    y_offset = case_when(
      country == 'ES' ~ value + .1,
      country == 'FR' ~ value +  0,
      country == 'DE' ~ value - .05,
      country == 'GB' ~ value - .1,
      country == 'PT' ~ value + .2,
      TRUE ~ value)
  ) |> 
  ungroup() |> 
  mutate(
    country_name = case_when(
      country == "DE" ~ "Germany",
      country == "ES" ~ "Spain",
      country == "FR" ~ "France",
      country == "GB" ~ "UK",
      country == "IE" ~ "Ireland",
      country == "NO" ~ "Norway",
      country == "PT" ~ "Portugal",
      TRUE ~ NA_character_
    ),
    country_color = case_when(
      country %in% c("DE", 'Germany') | country_name %in% 'Germany' ~ "#9b6e00",  # override DE/Germany label color here since curve color is different than country label color (only one)
      country %in% c('NO', 'Norway') ~ '#d43b45',
      country %in% c('ES', 'Spain') ~ '#b01622',
      country %in% c('FR', 'France') ~ '#487caa',
      country %in% c('GB', 'UK') ~ '#264250',
      country %in% c('IE', 'Ireland') ~ '#61A861',
      country %in% c('PT', 'Portugal') ~ '#d27e4e',
      TRUE ~ '#000000'
    )
  ) |> 
  inner_join(
    country_labels |> enframe() |> rename(values = value), join_by(country == name)
  ) |> 
  mutate(
    data_id = str_c(country, values),
    country = if_else(country == 'DE', 'Germany', country)
  )

# add caption to match Tom's
caption_text <- "<span style='color:#232323;'>0 = extremely bad, 10 = extremely good</span><br><span style='color:#939291; font-weight: normal;'>Chart: Tom Calver | The Times and The Sunday Times • Source: ESS/K. Kardous</span>"

p <- 
  all_series |> distinct() |> 
  ggplot(
    aes(x = year, 
        y = value, 
        group = data_id,
        color = country)
  ) +
  scale_color_manual(
    values = c(
      'NO' = "#d43b45",
      'DE' = '#DCA825',
      'ES' = '#b01622',
      'FR' = '#487caa',
      'GB' = '#264250',
      'IE' = '#61A861',
      'PT' = '#d27e4e')
  ) +
  scale_y_continuous(
    breaks = seq(0, 7, 1), limits = c(0, 8)
    ) + 
  scale_x_continuous(
    breaks = seq(2002, 2022, 2), 
    limits = c(2002, 2025),
    expand = c(0, 0.1)
  ) + 
  theme(
    legend.position = 'none',
    panel.grid.major.x = element_blank(),
    panel.grid.minor.x = element_blank()
  ) +
  geom_smooth_interactive(
    data = all_series,
    aes(x = year, y = value, data_id = paste0(country, country_name)),
    method = "loess",
    se = FALSE,
    linewidth = 3.5, # thick line acts as the 'border'
    alpha = 1,
    show.legend = FALSE,
    color = "white"
  ) +
  # colored interactive smooth line
  geom_smooth_interactive(
    data = all_series |> filter(!country %in% 'IE'),
    aes(data_id = paste0(country, country_name)),
    method = "loess", 
    se = FALSE, 
    linewidth = 0.9, 
    fill = NA,
    show.legend = FALSE
  ) +
  geom_smooth_interactive(
    data = all_series |> filter(country %in% 'IE'),
    aes(data_id = paste0(country, country_name)),
    method = "loess",
    se = FALSE, 
    linewidth = 0.9, 
    fill = NA, 
    show.legend = FALSE
  ) +
  scale_y_continuous(
    breaks = seq(0, 7, 1), limits = c(0, 8)
    ) + 
  scale_x_continuous(
    breaks = seq(2002, 2022, 2), 
    limits = c(2002, 2025),
    expand = c(0, 0.1)
  ) +
  labs(
    x = NULL,
    y = NULL,
    caption = caption_text
  ) +
  # final touchoups before interactive rendering thru girafe()
  theme(
    panel.spacing = unit(20, 'cm'),
    plot.margin = margin(l = 5, b = 10), # leave some space/margin at the bottom for caption 'room to breathe'
    axis.text = element_text(face = "bold"), # axis tick labels
    strip.text = element_text(face = "bold"), # facet labels
    panel.grid.major.x = element_blank(),
    axis.text.x = element_text(margin = margin(b = 10, t = -10)),
    panel.grid.major.y = element_line(color = "gray90"),
    axis.ticks.x = element_blank(),
    plot.caption = element_markdown(
      family = "Roboto",
      face = 'bold'
      ) 
  ) 

p_interactive <- p +
  geom_point_interactive(
    data = visible_points,
    aes(
      x = year,
      y = value, 
      color = country,
      data_id = paste0(country, country_name)
    ),
    alpha = 0.1, fill = 'white', show.legend = FALSE
  ) +
  geom_point_interactive(
    data = 
      all_series |> 
      mutate(
        point_size = if_else(country %in% c('NO', 'Norway', 'PT', 'Portugal'), 3, 1.5),
        point_stroke = point_size
      ),
    aes(
      x = year, 
      y = value,
      data_id = paste0(country, country_name),
      tooltip = paste0(
        "<div style='text-align:", 
        if_else(year <= 2015.250, "left", "right"), 
        "; line-height: 1.1;'>", # tightens spacing
        "<div style='font-weight:bold; font-size:16px; color:",
        if_else(country_name == "Germany", "#9b6e00", eval(color_map)), 
        ";'>", 
        eval(tooltip_map), 
        "</div>",
        "<div style='font-size:16px;'>", round(year, 0), "</div>",
        "<div style='font-size:16px;'>", round(value, 2), "</div>",
        "</div>"
      )
    ),
    color = 'white',
    fill = 'white', 
    shape = 21, 
    alpha = 0
  ) +
  geom_rect(
    inherit.aes = FALSE,
    aes(xmin = 2024, xmax = Inf, ymin = -Inf, ymax = Inf),
    color = NA, 
    fill = "white"
  ) +
  scale_color_manual(
    breaks = c("GB", "FR", "IE", "PT", "ES", "NO", 'DE', 'Germany'), # this makes sure Germany as a country label gets dark yellow while 'DE' as a curve gets a darker yellow color
    values = c("#264250", "#487caa", "#61A861", "#d27e4e", "#b01622", "#d43b45", '#DCA825', '#9b6e00') # this makes sure Germany as a country label gets dark yellow while 'DE' as a curve gets a darker yellow color
  ) +
  # scale_color_identity() +  # correctly apply the country color to the label's font
  # coord_cartesian(xlim = c(2002, 2024.5)) +
  theme(
    panel.grid.major.x = element_blank(),
    panel.grid.minor.x = element_blank()
  ) +
  # add persistent white circle that follows mouse (via selection); one to cover all data points is simply assigning data to country.year combo
  geom_point_interactive(
    data = all_series,
    aes(
      x = year,
      y = value,
      group = paste0(year, country_name)
    ),
    shape = 21,
    size = 0.4,
    stroke = 1,
    fill = 'white',
    color = "grey85",
    alpha = 0,
    show.legend = FALSE
  ) 

Data Visualization

Show the code
#|echo: false
#|message: false
#|warning: false

p_ggraph_ready <- 
  p_interactive + 
  theme_minimal() +  
  theme(
    panel.grid.major.x = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  ) + 
  labs(
    caption = caption_text
  ) + 
  theme(
    plot.caption = element_markdown(
      lineheight = 1.2,
      hjust = -0.01,
      margin = margin(t = 10, l = -7, r = 1),
      halign = 0
    ),
    axis.text.x = element_text(margin = margin(t = -7, l = 3, b = 7, r = -3))  
  ) +
  geom_segment(
    aes(x = 2002, xend = 2023, y = 0, yend = 0), 
    color = 'black', 
    linewidth = .1
  ) + 
    geom_label_interactive(
    data = 
      all_series |> 
      bind_rows(
        all_series |>
          slice_max(year) |>
          arrange(desc(value)) |> 
          mutate(
            country_name = str_sub(data_id, 3, 20),
            year = 2024,
            value = c(c(7, 5.8), c(5.7, 5.4, 5.1, 4.8, 4.5) - .3)
            )
        )
     |> 
      slice_max(year) |> 
      mutate(year = 2023.4),
    aes(
      x = year,
      y = value,
      group = paste0(country, country_name),
      label =  c("Norway", "Spain", "France", "Germany", "UK", "Ireland", "Portugal"),
      data_id = paste0(country, country_name)
    ),
    label.size = NA,
    fill = NA,
    color = c("#d43b45", "#b01622", "#487caa", "#9b6e00", "#264250", "#61A861", "#d27e4e"),
    size = 3.4,
    hjust = 0,
    vjust = -.2,
    # fontface = 'bold',
    inherit.aes = FALSE,
    alpha = 1
  ) 

girafe(
    ggobj = p_ggraph_ready,
    options = list(
        opts_tooltip(
        css = "
        background: transparent;
        border: none;
        box-shadow: none;
        font-family: sans-serif;
        text-shadow:
        0 0 4px rgba(234, 255, 255, 1),
        0 0 4px rgba(234, 255, 255, 1),
        0 0 4px rgba(255, 255, 255, 1);
        line-shadow:
        0 0 4px rgba(234, 255, 255, 1),
        0 0 4px rgba(234, 255, 255, 1),
        0 0 4px rgba(255, 255, 255, 1);
        border-radius: none;
        transform: translate(-50%, 20px);
        transition: all 0.2s ease-in-out;",
        delay_mouseover = 300,
        delay_mouseout = 200
        ),
        opts_hover(
          css = "stroke-width: 3; stroke-opacity: 0.9; fill-opacity: 0.9; opacity: 1;",
          nearest_distance = 30,
          reactive = FALSE
        ),
        opts_hover_inv(
          css = "stroke-width: .2; stroke-opacity: 0.2; fill-opacity: 0.2; opacity: 0.4;"
        )
    )
)

Notes on Above Plot: I’m not sure what software Tom uses to render the interactive plots, but it was much harder than expected to replicate the persistent hollow circle marker that moves along each curve and ‘links’ the tooltip to the marker using a small vertical tick. It wasn’t for a lack of trying but I believe current ggiraph framework (I might be wrong) doesn’t natively support said functionality.
Another

Having said that, I believe the rest remains faithful to the original throughout.

Value for Money

Notes on Below Plot:

Show the code
#|echo: false
#|message: false
#|warning: false
#|include: false

theme_set(
  theme_minimal() 
)

data <- 
  value_for_money |> 
  drop_na(country) |> 
  filter(
    !country %in% c('Norway', 'Australia')
    ) |> 
  mutate(
    last_year = year == 2023,
    country_tooltip = if_else(year == 2023, country, paste(country, year, sep = ', '))
  ) |>
  arrange(country, year) |>
  mutate(
    country_fill = case_when(
      str_detect(country_tooltip, "US") ~ "US",
      str_detect(country_tooltip, "France") ~ "France",
      str_detect(country_tooltip, "Italy") ~ "Italy",
      str_detect(country_tooltip, "Germany") ~ "Germany",
      str_detect(country_tooltip, "Canada") ~ "Canada",
      str_detect(country_tooltip, "Japan") ~ "Japan",
      str_detect(country_tooltip, "UK") ~ "UK",
      TRUE ~ country_tooltip
    )
  ) |>
  # make sure to hide tooltips for most recent years as they will get an explicit data label there anywyas
  mutate(
    country_tooltip = if_else(year == 2023, '', country_tooltip)
  ) |>
  mutate(
    country = factor(country, levels = c("Canada", "France", "Germany", "Japan", "Italy", "UK", "US"))
  )

p2 <- data |>
  ggplot(aes(x = spend, y = le, color = last_year, fill = country_fill, group = country_fill)) +
  geom_point_interactive(
    aes(size = size, data_id = country_fill, tooltip = country_tooltip),
    shape = 21, alpha = 1
  ) +
  geom_text_interactive(
    data = data |> slice_max(year) |> distinct(country_fill, .keep_all = TRUE),
    aes(
      text = country_fill,
      label = country_fill,
      data_id = country_fill,
      tooltip = country_tooltip
    ),
    hjust = -0.3, vjust = 0, alpha = 1
  ) +
  scale_fill_manual(
    breaks = c("US", "France", "Italy", "Germany", "Canada", "Japan", "UK"),
    values = c("#4076A4", "#80B1E2", "#61A961", "#F5C55E", "#FFAEA9", "#DACFC0", "#E94F55")
  ) +
  scale_color_manual(
    breaks = c(FALSE, TRUE),
    values = c('white', 'black')
  ) +
  theme(
    plot.title = element_markdown(size = 12, lineheight = 1.2, linewidth = 1.5),
    plot.subtitle = element_markdown(size = 12, lineheight = 1.2)
  ) +
  labs(
    title = '**Value for money**',
    subtitle = "How life expectancy and per-capita healthcare spend have changed since 2000.<br>
               <span style='background-color:#e94f55; color:white; padding:2px 4px;'>**UK**</span> spending is rising, but life expectancy has stalled."
  ) +
  labs(x = NULL, y = NULL) +
  scale_x_continuous(
    breaks = seq(3000, 11000, 1000),
    labels = c(format(seq(3000, 10000, 1000), big.mark = ",", trim = TRUE), "$11,000")
  ) +
  coord_cartesian(
    xlim = c(2100, 11300),
    ylim = c(77, 86),
    expand = FALSE,
    clip = 'off'
  ) +
  # add caption for p2
  labs(
    caption = "<span style='color:#232323; font-weight:bold;'>In US Dollars, adjusted for purchasing power and inflation. Excludes 2020-22.</span>  <br>
<span style='color:#989799; font-weight:bold;'>Chart: Tom Calver | The Times and The Sunday Times</span>"
  ) +
  theme(
    text = element_text(color = 'black', face = 'bold'),
    panel.grid.minor = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.major = element_line(size = 0.3, color = "grey80"),
    axis.line = element_line(color = "black", size = 0.3),
    legend.position = 'none',
    plot.caption = element_markdown(
      size = 10,
      hjust = 0,
      lineheight = 1.2,
      face = 'plain'
    )
  ) +
  annotate(
    geom = 'rect',
    xmin = 2075,
    xmax = 2345,
    ymax = 86.5,
    ymin = 86.15,
    fill = '#e94f55'
  ) +
  # we also need to annotate the years 2000 and 2023 with Germany's yellow hex code (to match what Tom has)
  # not so much for Germany but for reference in general to the range of years for the plot
  # 2000 persistent text geom; for 2023 we use text geom; for 2000, we use label with no borders to bring forward '2000'
  annotate(
    geom = 'label',
    label = '2000',
    x = 4250,
    y = 77.97,
    color = '#F5C55E',
    fill = 'white',
    label.size = NA,
    fontface = "bold"
  ) +
  # 2023 persistent text geom
  annotate(
    geom = 'text',
    label = '2023',
    x = 6400,
    y = 81.2,
    color = '#F5C55E',
    fontface = "bold"
  ) +
  # add x and y axes titles (within the plot itself)
  # y axis
  annotate(
    geom = 'text',
    label = 'Life expectancy',
    x = 2685,
    y = 85.8,
    color = '#7B7B7B',
    fontface = "bold",
    fontfamily = 'roboto',
    fontsize = 15
  ) +
  # x axis
  annotate(
    geom = 'text',
    label = 'Per-capita\n  spend',
    x = 11200,
    y = 77.5,
    color = '#7B7B7B',
    fontface = "bold",
    fontfamily = 'roboto',
    fontsize = 15,
    hjust = .9,
    vjust = .6
  )

girafe(
  ggobj = p2,
  width_svg = 10, height_svg = 6,
  options = list(
    opts_tooltip(
      css = "background: white;
             border: 1px solid #ddd;
             border-radius: 4px;
             padding: 6px;
             font-size: 14px;
             font-weight: bold;
             color: #232323;
             text-align: left;
             box-shadow: 2px 2px 5px rgba(0, 0, 0, 0.1);"
    ),
    opts_hover(
      css = "stroke-opacity: 1; fill-opacity: 1; color: #232323; font-size: 12px; alpha: 1;"
    ),
    opts_hover_inv(
      css = "fill-opacity: 0.01; stroke-opacity: 0.01; color: transparent; font-size: 0.1px;"
    )
  )
)

Data Sample

Below is a sample of the data